We present the linear first-order intermediate language IL for verified compilers.
IL is a functional language with calls to a nondeterministic environment.
We give IL terms a second, imperative semantic interpretation and
obtain a register transfer language. For the imperative interpretation we 
establish a notion of live variables. Based on live variables, we formulate 
a decidable property called coherence ensuring that the functional and the 
imperative interpretation of a term coincide. 

We formulate a register assignment algorithm for IL and prove its correctness.
The algorithm translates a functional IL program into an equivalent
imperative IL program. Correctness follows from the fact that the algorithm 
reaches a coherent program after consistently renaming local variables. We 
prove that the maximal number of live variables in the initial program bounds 
the number of different variables in the final coherent program. The entire 
development is formalized in Coq. 


1 Introduction 

We study the intermediate language IL for verified compilers. IL is a linear functional 
language with calls to a nondeterministic environment. 

We are interested in translating IL to a register transfer language. To this end, we give 
IL terms a second, imperative interpretation called IL/I. IL/I interprets variable
binding as assignment, and function application as goto, where parameter passing becomes
parallel assignment. 

For some IL terms the functional interpretation coincides with the imperative
interpretation. We call such terms invariant. We develop an efficiently decidable property we
call coherence that is sufficient for invariance. To translate IL to IL/I, translating to the
coherent subset of IL suffices, i.e. the entire translation can be done in the functional
setting. 

The notion of a live variable is central to the definition of coherence. Liveness analysis 
is a standard technique in compiler construction to over-approximate the set of variables 
the evaluation of a program depends on. Coherence is defined relative to the result of a 
liveness analysis. 

Figure 1: Programs (a) and (b) computing F(n, m) := n * (n + 1) * ... * m

Inspired by the correspondence between SSA [8] and functional programming [10, 2], 
we formulate a register assignment algorithm [9] for IL and show that it realizes the 
translation to IL/I. For example, the algorithm translates program (a) to program (b). 
Correctness follows from two facts: First, register assignment consistently renames
program (a) such that the variable names correspond to program (b). Second, program (b)
is coherent, hence let binding and imperative assignment behave equivalently. Parameter 
passing in IL/I can be eliminated by inserting parallel assignments [9]. In program (b), 
all parameters i, n can simply be removed, as they constitute self-assignments. 

A key property of SSA-based register assignment is that the number of imperative 
registers required after register assignment is bounded by the maximal number of
simultaneously live variables [9], which allows register assignment to be considered separate
from spilling. We show that our algorithm provides the same bound on the number of 
different variable names in the resulting IL/I term. 


1.1 Related Work 

Correspondences between imperative and functional languages were investigated already 
by Landin [11]. The correspondence between SSA and functional programming is due to
Appel [2] and Kelsey [10] and consists of a translation from SSA programs to functional 
programs in continuation passing style (CPS) [15, 1]. Chakravarty et al. [6] reformulate
SSA-based sparse conditional constant propagation on a functional language in
administrative normal form (ANF) [16]. Our intermediate language IL is in ANF, and a
sub-language (up to system calls) of the ANF language presented in Chakravarty et al.
[6].

Two major compiler verification projects using SSA exist. CompCertSSA [3] integrates 
SSA-based optimization passes into CompCert [13]. VeLLVM [18, 17] is an ongoing effort 
to verify the production compiler LLVM [12]. Both projects use imperative languages
with $\phi$-functions to enable SSA, and do not consider a functional intermediate language.
As of yet, neither of the projects verifies register assignment in the SSA setting. In the 
non-SSA setting, a register allocation algorithm, which also deals with spilling, has been 
formally verified [5]. 

Beringer et al. [4] use a language with a functional and imperative interpretation for 
proof carrying code. They give a sufficient condition for the two semantics to coincide,
which they call Grail normal form (GNF). GNF requires functions to be closure
converted, i.e. all variables a function body depends on must be parameters. 

Chlipala [7] proves correctness for a compiler from Mini-ML to assembly including 
mutable references, but without system calls. Register assignment uses an interference 
graph constructed from liveness information. Chlipala restricts functions to take exactly 
one argument and requires the program to be closure converted prior to register assignment.
This means liveness coincides with free variables, and values shared or passed
between functions reside in an (argument) tuple in the heap: Effectively, register
assignment is function local. Chlipala does not prove bounds on the number of different
variables used after register assignment and does not investigate the relationship to
$\alpha$-equivalence.

1.2 Contributions and Outline 

• We formally define the functional intermediate language IL and its imperative 
interpretation, IL/I. We establish the notion of live variables via an inductive
definition. We identify terms for which both semantic interpretations coincide via 
the decidable notion of coherence. 

• Inspired by SSA-based register allocation, we formulate a register assignment
algorithm for IL and prove that it realizes an equivalence preserving transformation
to IL/I. We show the size of the maximal live set bounds the number of names 
after register assignment. 

• All results in this paper have formal Coq proofs, and the development is available 
online (see Section 9). We omit proofs in the paper for space reasons. This version 
contains an appendix. 

The paper is structured as follows: We introduce the languages in Section 2 and Section 3. 
Program equivalence is defined in Section 4. We define invariance in Section 5,
establish a notion of live variables in Section 6, and present coherence in Section 7. Register
assignment is treated in Section 8. 

2 IL 

Values, Variables, and Expressions We assume a set $\mathbb{V}$ of values and a function
$\beta : \mathbb{V} \to \{0,1\}$ that we use to simplify the semantic rule for the conditional.
By convention, $v$ ranges over $\mathbb{V}$. We use the countably-infinite alphabet $\mathcal{V}$
for names $x, y, z$ of values, which we call variables.

We assume a type Exp of expressions. By convention, $e$ ranges over Exp. Expressions
are pure; their evaluation is deterministic and may fail. Hence expression evaluation is
a function $[\![\cdot]\!] : \mathrm{Exp} \to (\mathcal{V} \to \mathbb{V}_\bot) \to \mathbb{V}_\bot$.
Environments are of type $\mathcal{V} \to \mathbb{V}_\bot$ to track uninitialized variables.
We assume a function $\mathrm{fv} : \mathrm{Exp} \to \mathrm{set}\ \mathcal{V}$ such that for all
environments $V, V'$ that agree on $\mathrm{fv}(e)$ we have $[\![e]\!]\,V = [\![e]\!]\,V'$.
We lift $[\![\cdot]\!]$ pointwise to lists of expressions in a strict fashion:
$[\![\overline{e}]\!]$ yields a list of values if none of the expressions in $\overline{e}$
failed, and $\bot$ otherwise.

$$\begin{array}{rcll}
\eta & ::= & e \mid \alpha & \text{extended expression} \\
\mathrm{Term} \ni s, t & ::= & \mathsf{let}\ x = \eta\ \mathsf{in}\ s & \text{variable binding} \\
 & \mid & \mathsf{if}\ e\ \mathsf{then}\ s\ \mathsf{else}\ t & \text{conditional} \\
 & \mid & e & \text{value} \\
 & \mid & \mathsf{fun}\ f\ \overline{x} = s\ \mathsf{in}\ t & \text{function definition} \\
 & \mid & f\ \overline{e} & \text{application}
\end{array}$$

Figure 2: Syntax of IL

Syntax IL is a functional language with a tail-call restriction and system calls. IL
syntactically enforces a first-order discipline by using a separate alphabet $\mathcal{F}$ for names
$f, g$ of function type, which we call labels. IL uses a third alphabet $\mathcal{A}$ for names $\alpha$ which
we call actions. The term let $x = \alpha$ in ... is like a system call $\alpha$ that non-deterministically
returns a value. The formal development treats system calls with arguments. Their
treatment is straightforward and omitted here for the sake of simplicity.

IL allows function definitions, but does not allow mutually recursive definitions. The
syntax of IL is given in Figure 2.

Semantics The semantics of IL is given as a small-step relation $\to$ in Figure 3. Note
that the tail-call restriction ensures that no call stack is required. The reduction relation
$\to$ operates on configurations of the form $(F, V, s)$ where $s$ is the IL term to be
evaluated. The semantics does not rely on substitution, but uses an environment
$V : \mathcal{V} \to \mathbb{V}_\bot$ for variable definitions and a context $F$ for function definitions.
Transitions in $\to$ are labeled with events $\phi$. By convention, $\psi$ ranges over events
different from $\tau$.

$$\mathcal{E} \ni \phi ::= \tau \mid v = \alpha$$

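A context is a list of named definitions. A definition in a context may refer to previous
definitions and itself. Notationally, we use contexts like functions: If a context $F$ can be
decomposed as $F_1; f : a; F_2$ where $f \notin \mathrm{dom}\,F_2$, we write $F f$ for $a$ and $F^f$ for
$F_1; f : a$. Otherwise, $F f = \bot$. To ease presentation of partial functions, we treat
$f : \bot$ as if $f$ was not defined, i.e. $f \notin \mathrm{dom}(f : \bot)$. We write $\emptyset$ for the
empty context.

Op
$$\frac{[\![e]\!]\,V = v}{F \mid V \mid \mathsf{let}\ x = e\ \mathsf{in}\ s \xrightarrow{\tau} F \mid V[x \mapsto v] \mid s}$$

Cond
$$\frac{[\![e]\!]\,V = v \qquad \beta(v) = i}{F \mid V \mid \mathsf{if}\ e\ \mathsf{then}\ s_1\ \mathsf{else}\ s_0 \xrightarrow{\tau} F \mid V \mid s_i}$$

Extern
$$\frac{v \in \mathbb{V}}{F \mid V \mid \mathsf{let}\ x = \alpha\ \mathsf{in}\ s \xrightarrow{v = \alpha} F \mid V[x \mapsto v] \mid s}$$

Let
$$\frac{}{F \mid V \mid \mathsf{fun}\ f\ \overline{x} = s\ \mathsf{in}\ t \xrightarrow{\tau} F; f : (V, \overline{x}, s) \mid V \mid t}$$

App
$$\frac{[\![\overline{e}]\!]\,V = \overline{v} \qquad F f = (V', \overline{x}, s)}{F \mid V \mid f\ \overline{e} \xrightarrow{\tau} F^f \mid V'[\overline{x} \mapsto \overline{v}] \mid s}$$

Figure 3: Semantics of IL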

A closure is a tuple $(V, \overline{x}, s) \in \mathcal{C}$ consisting of an environment $V$, a parameter list
$\overline{x}$, and a function body $s$. Since a function $f$ in a context $F; f : \ldots; F'$ can refer to
function definitions in $F$ (and to itself), the first-order restriction allows the closures to
be non-recursive: function closures do not need to close under labels. An application $f\ \overline{e}$
causes the function context $F$ to rewind to $F^f$, i.e. up to the definition of $f$ (rule App).
In contrast to higher-order formulations, we do not define closures mutually recursively
with the values of the language.

A system call let $x = \alpha$ in $s$ invokes a function $\alpha$ of the system, which is not assumed
to be deterministic. This is reflected in the rule Extern, which does not restrict the result
value of the system call other than requiring that it is a value. The semantic transition
records the system call name $\alpha$ and the result value $v$ in the event $v = \alpha$.

IL is linear in the sense that the execution of each term either passes control to a
strict subterm, or applies a function that never returns. This ensures no run-time stack
is required to manage continuations. The imperative language While, by contrast, uses
sequentialization ; to manage a stack of continuations.

3 Imperative Interpretation of IL: IL/I 

We are interested in a translation of IL to an imperative language that does not require 
function closures at run-time. We introduce a second semantic interpretation for IL 
which we call IL/I to investigate this translation. IL/I is an imperative language, where 
variable binding is interpreted as imperative assignment. Function application becomes a 
goto, and parameter passing is a parallel assignment to the parameter names. Closures
are replaced by blocks $(\overline{x}, s) \in \mathcal{B}$, and blocks do not contain variable environments.
Consequently, a called function can see all previous updates to variables. For example,
the following two programs each return 5 in IL/I, but evaluate to 7 in IL:

    let x = 7 in                let x = 7 in
    fun f () = x in             fun f () = x in
    let x = 5 in f ()           fun g x = f () in
                                let y = 5 in g y


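To obtain the IL/I small-step relation, we replace the rules Let and App by the
following rules:

I-Let
$$\frac{}{L \mid V \mid \mathsf{fun}\ f\ \overline{x} = s\ \mathsf{in}\ t \xrightarrow{\tau} L; f : (\overline{x}, s) \mid V \mid t}$$

I-App
$$\frac{[\![\overline{e}]\!]\,V = \overline{v} \qquad L f = (\overline{x}, s)}{L \mid V \mid f\ \overline{e} \xrightarrow{\tau} L^f \mid V[\overline{x} \mapsto \overline{v}] \mid s}$$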
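The difference between the two interpretations can be made concrete with a small interpreter. The following is a minimal sketch, not the Coq development: the tuple encoding of terms, the names `eval_exp`, `step` and `run`, the restriction of expressions to integer constants, variables, `+` and `*`, the fuel bound, and the choice that every non-zero value counts as true are all assumptions made for illustration.

```python
# Hypothetical encoding: ("let", x, eta, s), ("if", e, s_then, s_else), ("ret", e),
# ("fun", f, params, body, t), ("app", f, args); eta is an expression or ("act", name).

def eval_exp(e, V):
    """Evaluate e in environment V; None stands for failure (bottom)."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return V.get(e)                      # uninitialized variable -> None
    op, a, b = e
    x, y = eval_exp(a, V), eval_exp(b, V)
    if x is None or y is None:
        return None
    return {"+": x + y, "*": x * y}.get(op)  # unknown operator -> failure


def step(ctx, V, s, imperative, oracle):
    """One small step; returns (event, ctx', V', s') or None if no step exists."""
    tag = s[0]
    if tag == "let":
        _, x, eta, body = s
        if isinstance(eta, tuple) and eta[0] == "act":   # Extern: result chosen by oracle
            v = oracle(eta[1])
            return ((v, eta[1]), ctx, {**V, x: v}, body)
        v = eval_exp(eta, V)                             # Op
        return None if v is None else ("tau", ctx, {**V, x: v}, body)
    if tag == "if":                                      # Cond (assumed: non-zero is true)
        _, e, s_then, s_else = s
        v = eval_exp(e, V)
        return None if v is None else ("tau", ctx, V, s_then if v != 0 else s_else)
    if tag == "fun":                                     # Let stores a closure, I-Let a block
        _, f, xs, body, t = s
        entry = (xs, body) if imperative else (dict(V), xs, body)
        return ("tau", ctx + [(f, entry)], V, t)
    if tag == "app":                                     # App / I-App
        _, f, args = s
        vs = [eval_exp(e, V) for e in args]
        idx = max((i for i, (g, _) in enumerate(ctx) if g == f), default=None)
        if idx is None or None in vs:
            return None
        *env, xs, body = ctx[idx][1]
        if len(xs) != len(vs):
            return None
        base = V if imperative else env[0]               # IL/I keeps the current environment
        return ("tau", ctx[:idx + 1], {**base, **dict(zip(xs, vs))}, body)
    return None                                          # ("ret", e) is terminal


def run(s, imperative, oracle=lambda a: 0, fuel=10_000):
    """Run s from the empty context; returns (external events, result or None)."""
    ctx, V, events = [], {}, []
    for _ in range(fuel):
        nxt = step(ctx, V, s, imperative, oracle)
        if nxt is None:
            return events, (eval_exp(s[1], V) if s[0] == "ret" else None)
        ev, ctx, V, s = nxt
        if ev != "tau":
            events.append(ev)
    raise RuntimeError("fuel exhausted (possible divergence)")


p1 = ("let", "x", 7, ("fun", "f", [], ("ret", "x"),
      ("let", "x", 5, ("app", "f", []))))
p2 = ("let", "x", 7, ("fun", "f", [], ("ret", "x"),
      ("fun", "g", ["x"], ("app", "f", []),
       ("let", "y", 5, ("app", "g", ["y"])))))
```

Under these assumptions, `run(p1, imperative=False)` and `run(p2, imperative=False)` yield the result 7, while the same programs with `imperative=True` yield 5, matching the two example programs above.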


4 Program Equivalence 


To relate programs from different languages, we abstract from a configuration’s internal 
behavior and only consider interactions with the environment (via system calls) and 
termination behavior. IL’s reduction relation forms a labeled transition system (LTS) 
over configurations. 

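Definition 1 A reduction system (RS) is a tuple $(\Sigma, \mathcal{E}, \to, \tau, \mathrm{res})$ such that

(1) $(\Sigma, \mathcal{E}, \to)$ is an LTS
(2) $\tau \in \mathcal{E}$
(3) $\mathrm{res} : \Sigma \to \mathbb{V}_\bot$
(4) $\mathrm{res}\ \sigma = v \Rightarrow \sigma$ is $\to$-terminal

An internally deterministic reduction system (IDRS) additionally satisfies

(5) $\sigma \xrightarrow{\phi} \sigma_1 \wedge \sigma \xrightarrow{\phi} \sigma_2 \Rightarrow \sigma_1 = \sigma_2$ (action-deterministic)
(6) $\sigma \xrightarrow{\phi} \sigma_1 \wedge \sigma \xrightarrow{\tau} \sigma_2 \Rightarrow \phi = \tau$ ($\tau$-deterministic)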

4.1 Partial Traces 


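We consider two configurations in an IDRS equivalent if they produce the same partial
traces. A partial trace $\pi$ adheres to the following grammar:

$$\Pi \ni \pi ::= \epsilon \mid v \mid \bot \mid \psi\,\pi$$

We inductively define the relation $\triangleright \subseteq \Sigma \times \Pi$ such that $\sigma \triangleright \pi$
whenever $\sigma$ produces the trace $\pi$. In the following, we write trace for partial trace.

Tr-Tau
$$\frac{\sigma \xrightarrow{\tau} \sigma' \qquad \sigma' \triangleright \pi}{\sigma \triangleright \pi}$$

Tr-End
$$\frac{}{\sigma \triangleright \epsilon}$$

Tr-Trm
$$\frac{\sigma\ \text{is}\ \to\text{-terminal}}{\sigma \triangleright \mathrm{res}\ \sigma}$$

Tr-Evt
$$\frac{\sigma \xrightarrow{\psi} \sigma' \qquad \sigma' \triangleright \pi}{\sigma \triangleright \psi\,\pi}$$

The traces a configuration produces are given as $\mathcal{P}\sigma = \{\pi \mid \sigma \triangleright \pi\}$.

Definition 2 (Trace Equivalence) $\sigma \simeq \sigma' :\Longleftrightarrow \mathcal{P}\sigma = \mathcal{P}\sigma'$

Lemma 1 $\sigma$ silently diverges if and only if $\mathcal{P}\sigma = \{\epsilon\}$.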
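The trace rules can be explored on a small finite transition system. The sketch below is an illustration under assumptions: the dictionary encoding of the LTS, the function name `traces`, the string `"tau"` for the silent event, the wrapper `("res", v)` for the final result, and the depth bound (needed because a diverging configuration has infinitely long derivations) are not part of the paper.

```python
def traces(lts, res, sigma, depth):
    """Partial traces of sigma, following at most `depth` transitions.

    lts maps a state to a list of (event, successor) pairs; "tau" is the silent event.
    res maps terminal states to their result (missing entries model bottom).
    A trace is a tuple of visible events, optionally ended by ("res", v).
    """
    out = {()}                                    # Tr-End: every state has the empty trace
    succs = lts.get(sigma, [])
    if not succs:                                 # Tr-Trm: terminal state yields its result
        out.add((("res", res.get(sigma)),))
    if depth > 0:
        for ev, nxt in succs:
            for pi in traces(lts, res, nxt, depth - 1):
                # Tr-Tau drops the silent step, Tr-Evt prepends the visible event
                out.add(pi if ev == "tau" else (ev,) + pi)
    return out


# A state that performs one system call and terminates, and a silent loop.
lts = {
    "s0": [("tau", "s1")],
    "s1": [("3=read", "s2")],
    "s2": [],
    "loop": [("tau", "loop")],
}
res = {"s2": 3}
```

With this encoding, the silently diverging state `"loop"` only has the empty trace, which is the situation described by Lemma 1, whereas `"s0"` additionally has the traces `("3=read",)` and `("3=read", ("res", 3))`.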

4.2 Bisimilarity 

We give a sound and complete characterization of trace equivalence via bisimilarity. 
Bisimilarity enables coinduction as proof method for program equivalence, which is more 
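concise than arguing about traces directly. We say a configuration $\sigma$ is ready if the next
step is a system call. We write $\sigma_2 \succeq_R \sigma_1$ for
$\forall \psi, \sigma_1'.\ \sigma_1 \xrightarrow{\psi} \sigma_1' \Rightarrow \exists \sigma_2'.\ \sigma_2 \xrightarrow{\psi} \sigma_2' \wedge \sigma_1' \mathrel{R} \sigma_2'$.

We write $\sigma \Downarrow w$ (where $w \in \mathbb{V}_\bot$) if $\sigma \to^* \sigma'$ such that $\sigma'$ is
$\to$-terminal and $\mathrm{res}(\sigma') = w$.

Definition 3 (Bisimilarity) Let $(\Sigma, \mathcal{E}, \to, \mathrm{res}, \tau)$ be an IDRS. Bisimilarity
$\sim\ \subseteq \Sigma \times \Sigma$ is coinductively defined as the greatest relation closed under
the following rules:

Bisim-Term
$$\frac{\sigma_1 \Downarrow w \qquad \sigma_2 \Downarrow w}{\sigma_1 \sim \sigma_2}$$

Bisim-Silent
$$\frac{\sigma_1 \to^+ \sigma_1' \qquad \sigma_2 \to^+ \sigma_2' \qquad \sigma_1' \sim \sigma_2'}{\sigma_1 \sim \sigma_2}$$

Bisim-Extern
$$\frac{\sigma_1 \to^* \sigma_1' \qquad \sigma_2 \to^* \sigma_2' \qquad \sigma_1', \sigma_2'\ \text{ready} \qquad \sigma_1' \succeq_\sim \sigma_2' \qquad \sigma_2' \succeq_\sim \sigma_1'}{\sigma_1 \sim \sigma_2}$$

Bisim-Silent allows matching finitely many steps on both sides, as long as all transitions
are silent. This makes sense for IDRS, but would not yield a meaningful definition
otherwise. Bisim-Extern ensures that every external transition of $\sigma_1'$ is matched by
the same external transition of $\sigma_2'$, and vice versa. This ensures that if two programs
are in relation, they react to every possible result value of the external call in a bisimilar
way. The premise that $\sigma_1', \sigma_2'$ are ready is there to simplify case distinctions by ensuring
that the next event cannot be $\tau$.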

Theorem 1 (Soundness and Completeness) Let $(\Sigma, \mathcal{E}, \to, \mathrm{res}, \tau)$ be an IDRS and
$\sigma, \sigma' \in \Sigma$. Then: $\sigma \simeq \sigma' \Longleftrightarrow \sigma \sim \sigma'$

The semantics of IL and of IL/I each form an IDRS. We define res such that
$\mathrm{res}(\sigma) = v$ if $\sigma$ is of the form $(F, V, e)$ and $[\![e]\!]\,V = v$. Otherwise,
$\mathrm{res}(\sigma) = \bot$. The definitions for IL/I are analogous. To relate configurations of
IL to IL/I, we form a reduction system on the sum $\Sigma_F + \Sigma_I$ of the configurations and
lift $\to$ and res accordingly. It is easy to see that the resulting reduction system is
internally deterministic. If not clear from context, we use an index $\sigma_F$, $\sigma_I$ to indicate
which language a configuration belongs to.
5 Invariance

We call a term invariant if it has the same traces in both the functional and the
imperative interpretation.

Definition 4 (Invariance) A closed program $s$ is invariant if

$$\forall V,\ (\emptyset, V, s)_F \simeq (\emptyset, V, s)_I$$

Invariance is undecidable. We develop a syntactic, efficiently decidable criterion sufficient 
for invariance, which we call coherence. Coherence simplifies the translation between IL 
and IL/I. 

Coherence is based on the observation that some IL programs do not really depend
on information from the closure. Assume $F f = (V', \overline{x}, s)$ and consider the following IL
reduction according to rule App:

$$(F, V, f\ \overline{e}) \to (F^f, V'[\overline{x} \mapsto \overline{v}], s)$$

If $V$ agrees with $V'$ on all variables $X$ that $s$ depends on, then the configuration could
have equivalently reduced to $(F^f, V[\overline{x} \mapsto \overline{v}], s)$. This reduction does not require the
closure environment $V'$ and is similar in spirit to the rule I-App. Coherence is a syntactic
criterion that ensures $V$ and $V'$ agree on a suitable set $X$ at every function application. We
proceed in two steps:

1. Section 6 introduces the notion of live variables, which identifies a set that contains 
all variables a program depends on. 

2. Section 7 gives the inductive definition of coherence and shows that coherent
programs are invariant.


6 Liveness 

A variable $x$ is significant to a program $s$ and a context $L$ if there is an environment $V$
and a value $v$ such that $(L, V, s)_I \not\simeq (L, V[x \mapsto v], s)_I$. Significance is not decidable, as
it is a non-trivial semantic property.

Liveness analysis is a standard technique in compiler construction to over-approximate 
the set of variables significant to the evaluation of an imperative program. While usual 
characterizations of live variables rely on data-flow equations [14], we define liveness 
inductively on the structure of IL’s syntax. To the best of our knowledge, such an 
inductive definition is not in literature. The inductive definition factorizes the correctness 
aspect from the algorithmic aspect of liveness analysis. 

We embed liveness information in the syntax of IL by introducing annotations for
function definitions: The term fun $f\ \overline{x} : X = s$ in $t$ is annotated with a set of variables
$X$.

6.1 Inductive Definition of the Liveness Judgment

We define inductively the judgment live, which characterizes sound results of a liveness
analysis.

$$\Lambda \vdash \mathrm{live}\ s : X \quad \text{where} \quad
\begin{array}{lll}
\Lambda & \text{context (set } \mathcal{V}) & \text{liveness for functions} \\
X & \mathrm{set}\ \mathcal{V} & \text{live variables} \\
s & \mathrm{Term} & \text{term}
\end{array}$$

The predicate $\Lambda \vdash \mathrm{live}\ s : X$ can be read as: $X$ contains all variables significant to $s$ in
any context satisfying the assumptions $\Lambda$. The context $\Lambda$ records for every function $f$ a
set of variables $X$ that we call the globals of $f$. Assuming $\overline{x}$ are the parameters of $f$,
we will arrange things such that the set $X \cup \overline{x}$ contains all variables significant for the
body of $f$, but never a parameter of $f$: $X \cap \overline{x} = \emptyset$. Throughout the paper, $\Lambda$ is always
a (partial) mapping from labels to globals, and $X$ denotes a set of variables.


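Live-Op
$$\frac{\mathrm{fv}(\eta) \subseteq X \qquad x \in X' \qquad X' \setminus \{x\} \subseteq X \qquad \Lambda \vdash \mathrm{live}\ s : X'}{\Lambda \vdash \mathrm{live}\ \mathsf{let}\ x = \eta\ \mathsf{in}\ s : X}$$

Live-Exp
$$\frac{\mathrm{fv}(e) \subseteq X}{\Lambda \vdash \mathrm{live}\ e : X}$$

Live-App
$$\frac{X_1 \subseteq X \qquad \mathrm{fv}(\overline{e}) \subseteq X}{\Lambda; f : X_1; \Lambda' \vdash \mathrm{live}\ f\ \overline{e} : X}$$

Live-Cond
$$\frac{\mathrm{fv}(e) \subseteq X \qquad X_1 \cup X_2 \subseteq X \qquad \Lambda \vdash \mathrm{live}\ s_1 : X_1 \qquad \Lambda \vdash \mathrm{live}\ s_2 : X_2}{\Lambda \vdash \mathrm{live}\ \mathsf{if}\ e\ \mathsf{then}\ s_1\ \mathsf{else}\ s_2 : X}$$

Live-Fun
$$\frac{\Lambda; f : X_1 \vdash \mathrm{live}\ s_1 : X_1 \cup \overline{x} \qquad X_1 \cap \overline{x} = \emptyset \qquad \Lambda; f : X_1 \vdash \mathrm{live}\ s_2 : X_2 \qquad X_2 \subseteq X}{\Lambda \vdash \mathrm{live}\ \mathsf{fun}\ f\ \overline{x} : X_1 = s_1\ \mathsf{in}\ s_2 : X}$$

Figure 4: Liveness: An approximation of the significant variables for IL/I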


6.1.1 Description of the Rules. 

Live-Op ensures that all variables free in $\eta$ are live. Every live variable of the
continuation $s$ except $x$ must be live at the assignment. We require $x$ to be live in the
continuation. Live-Cond ensures that the live variables of a conditional at least
contain the free variables of the condition, and the variables live in the consequence and
alternative. Live-Exp ensures that for programs consisting of a single expression $e$ at
least the free variables of $e$ are live. Live-App ensures that the free variables of every
argument are live, and that the globals $X_1$ of $f$ are live at the call site. Live-Fun
records the annotation $X_1$ as globals for $f$ in $\Lambda$, ensures that $X_1 \cup \overline{x}$ is a large enough
live set for the function body, and that $X_1$ does not contain parameters of $f$. The live
variables $X_2$ of the continuation $s_2$ must be live at the function definition.

Theorem 2 (Liveness is Decidable) For all $\Lambda$, $X$ and annotated $s$, it is efficiently
decidable whether $\Lambda \vdash \mathrm{live}\ s : X$ holds.

The proof of Theorem 2 is constructive and yields an efficient, extractable decision
procedure. The decision procedure recursively descends on the program structure, checking
the conditions of the appropriate rule in every step.
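A recursive check along the lines of Figure 4 might look as follows. This is a sketch under assumptions rather than the extracted Coq procedure: terms are encoded as tuples, only function definitions carry an annotation, and the existentially quantified live sets of continuations are resolved by computing the least admissible set (the helper `least_live` is a hypothetical name). The judgment is upward closed in $X$, so comparing against the least set suffices.

```python
# Hypothetical encoding: ("let", x, eta, s), ("if", e, s1, s2), ("ret", e),
# ("app", f, args), ("fun", f, params, globals, body, continuation).

def fv(e):
    """Free variables of an expression: int constant, str variable,
    ("act", name) for a system call, or (op, e1, e2)."""
    if isinstance(e, int):
        return frozenset()
    if isinstance(e, str):
        return frozenset({e})
    if e[0] == "act":
        return frozenset()
    return fv(e[1]) | fv(e[2])


def least_live(Lam, s):
    """Least X with Lam |- live s : X, or None if the judgment holds for no X."""
    tag = s[0]
    if tag == "let":                                   # Live-Op
        _, x, eta, body = s
        Xp = least_live(Lam, body)
        return None if Xp is None else fv(eta) | (Xp - {x})
    if tag == "ret":                                   # Live-Exp
        return fv(s[1])
    if tag == "app":                                   # Live-App
        _, f, args = s
        if f not in Lam:
            return None
        return frozenset(Lam[f]).union(*map(fv, args))
    if tag == "if":                                    # Live-Cond
        _, e, s1, s2 = s
        X1, X2 = least_live(Lam, s1), least_live(Lam, s2)
        return None if X1 is None or X2 is None else fv(e) | X1 | X2
    if tag == "fun":                                   # Live-Fun
        _, f, xs, X1, s1, s2 = s
        X1, params = frozenset(X1), frozenset(xs)
        Lam2 = {**Lam, f: X1}
        body = least_live(Lam2, s1)
        if body is None or X1 & params or not body <= X1 | params:
            return None
        return least_live(Lam2, s2)
    raise ValueError(f"unknown term: {tag}")


def live(Lam, s, X):
    """Decide Lam |- live s : X."""
    m = least_live(Lam, s)
    return m is not None and m <= frozenset(X)


good = ("let", "x", 7, ("fun", "f", [], {"x"}, ("ret", "x"),
        ("let", "y", 5, ("app", "f", []))))
bad = ("let", "x", 7, ("fun", "f", [], set(), ("ret", "x"),
       ("let", "y", 5, ("app", "f", []))))
```

Here `live({}, good, set())` holds, while `bad` is rejected because the annotation of `f` omits the global `x` that its body reads.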

6.2 Liveness Approximates Significance 

We show that the live variables approximate the significant variables. We write $L \models \Lambda$
if a context $L$ satisfies the assumptions $\Lambda$, and define:

LiveCtx1
$$\frac{L \models \Lambda \qquad X \cap \overline{x} = \emptyset \qquad \Lambda; f : X \vdash \mathrm{live}\ s : X \cup \overline{x}}{L; f : (\overline{x}, s) \models \Lambda; f : X}$$

LiveCtx2
$$\frac{}{\emptyset \models \emptyset}$$

LiveCtx1 ensures that $X$ does not contain parameters and that $X \cup \overline{x}$ is a large enough
live set for the function body $s$ under the context $\Lambda; f : X$.

We can now formally state the soundness of the live predicate. We prove that if
$\Lambda \vdash \mathrm{live}\ s : X$, then $X$ contains at least the significant variables of $s$ in every context $L$
that satisfies the assumptions $\Lambda$. We write $V =_X V'$ if $V$ and $V'$ agree on $X$, that is if
$\forall x \in X,\ V x = V' x$.

Theorem 3 For every program $s$, if $\Lambda \vdash \mathrm{live}\ s : X$ and $L \models \Lambda$ and $V =_X V'$, then
$(L, V, s)_I \simeq (L, V', s)_I$.


7 Coherence 

Coherence is a syntactic condition that ensures that a program is invariant. Coherence
is defined relative to liveness information $\Lambda \vdash \mathrm{live}\ s : X$.

In the following programs, the set of globals of $f$ is $\{x\}$. The program on the left is
not invariant, while the program on the right is coherent.

    1 let x = 7 in                1 let x = 7 in
    2 fun f () : {x} = x in       2 fun f () : {x} = x in
    3 let x = 5 in f ()           3 let y = 5 in f ()

In the program on the left in line 3, the value of $x$ is 5 and disagrees with the value of
$x$ in the closure of $f$. In the program on the right, $x$ was not redefined, hence both IL
and IL/I will compute 7. We say a function $f$ is available as long as none of $f$'s globals
were redefined. The inductive definition of coherence ensures only available functions
are applied.


7.1 Inductive Predicate 


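The coherence judgment is of the form $\Lambda \vdash \mathrm{coh}\ s$, where $s$ is an annotated program
and $\Lambda$ is similar to the context in the liveness judgment. We exploit that contexts realize
a partial mapping, and maintain the invariant that $\Lambda$ maps only available functions to
their globals, and all other functions to $\bot$. The inductive definition given below ensures
that only available functions are applied.

Coh-Op
$$\frac{\lfloor \Lambda \rfloor_{\mathcal{V} \setminus \{x\}} \vdash \mathrm{coh}\ s}{\Lambda \vdash \mathrm{coh}\ \mathsf{let}\ x = \eta\ \mathsf{in}\ s}$$

Coh-Exp
$$\frac{}{\Lambda \vdash \mathrm{coh}\ e}$$

Coh-Cond
$$\frac{\Lambda \vdash \mathrm{coh}\ s \qquad \Lambda \vdash \mathrm{coh}\ t}{\Lambda \vdash \mathrm{coh}\ \mathsf{if}\ x\ \mathsf{then}\ s\ \mathsf{else}\ t}$$

Coh-App
$$\frac{\Lambda f \neq \bot}{\Lambda \vdash \mathrm{coh}\ f\ \overline{y}}$$

Coh-Fun
$$\frac{\lfloor \Lambda; f : X \rfloor_X \vdash \mathrm{coh}\ s \qquad \Lambda; f : X \vdash \mathrm{coh}\ t}{\Lambda \vdash \mathrm{coh}\ \mathsf{fun}\ f\ \overline{x} : X = s\ \mathsf{in}\ t}$$

7.1.1 Description of the Rules.

Coh-Op deals with binding a variable $x$. Every function that has $x$ as a global (i.e.
$x \in \Lambda f$) becomes unavailable, and must be removed from $\Lambda$. We write $\lfloor \Lambda \rfloor_X$ to remove
all definitions from $\Lambda$ that require more globals than $X$. Trivially, $\lfloor \Lambda \rfloor_{\mathcal{V}} = \Lambda$. To remove
all definitions from $\Lambda$ that use $x$ as global, we use $\lfloor \Lambda \rfloor_{\mathcal{V} \setminus \{x\}}$.

Formally, the definition of $\lfloor \Lambda \rfloor_X$ exploits the list structure of contexts:

$$\begin{array}{rcll}
\lfloor \emptyset \rfloor_X & = & \emptyset & \\
\lfloor \Lambda; f : X' \rfloor_X & = & \lfloor \Lambda \rfloor_X; f : X' & X' \subseteq X \\
\lfloor \Lambda; f : \bot \rfloor_X & = & \lfloor \Lambda \rfloor_X; f : \bot & \\
\lfloor \Lambda; f : X' \rfloor_X & = & \lfloor \Lambda \rfloor_X; f : \bot & X' \not\subseteq X
\end{array}$$

Coh-App ensures only available functions can be applied, since $\Lambda$ maps functions that
are not available to $\bot$. Coh-Fun deals with function definitions. When the definition of
a function $f$ is encountered, its globals $X$ according to the annotation are recorded in $\Lambda$.
In the function body $s$, only functions that require at most $X$ as globals are available,
so the context is restricted to $\lfloor \Lambda; f : X \rfloor_X$.

Theorem 4 (Coherence is Decidable) For all $\Lambda$ and annotated $s$, it is efficiently
decidable whether $\Lambda \vdash \mathrm{coh}\ s$ holds.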
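The coherence rules translate directly into a recursive check. The sketch below is an illustration, not the verified decision procedure: it assumes a tuple encoding of terms with annotated function definitions, represents $\Lambda$ as a dictionary in which unavailable functions map to `None`, and passes the (possibly infinite) restriction set as a predicate on variables; the names `restrict` and `coh` are hypothetical.

```python
# Hypothetical encoding: ("let", x, eta, s), ("if", e, s, t), ("ret", e),
# ("app", f, args), ("fun", f, params, globals, body, continuation).

def restrict(Lam, keep):
    """Restriction of Lam to the set described by the predicate `keep`:
    a function whose globals are not all kept becomes unavailable (None)."""
    return {
        f: (G if G is not None and all(keep(v) for v in G) else None)
        for f, G in Lam.items()
    }


def coh(Lam, s):
    """Decide Lam |- coh s for an annotated term s."""
    tag = s[0]
    if tag == "let":                                  # Coh-Op: x is (re)defined
        _, x, _eta, body = s
        return coh(restrict(Lam, lambda v: v != x), body)
    if tag == "ret":                                  # Coh-Exp
        return True
    if tag == "if":                                   # Coh-Cond
        return coh(Lam, s[2]) and coh(Lam, s[3])
    if tag == "app":                                  # Coh-App: f must be available
        return Lam.get(s[1]) is not None
    if tag == "fun":                                  # Coh-Fun
        _, f, _xs, X, body, t = s
        X = frozenset(X)
        Lam2 = {**Lam, f: X}
        return coh(restrict(Lam2, lambda v: v in X), body) and coh(Lam2, t)
    raise ValueError(f"unknown term: {tag}")


# The two example programs from the beginning of Section 7.
left = ("let", "x", 7, ("fun", "f", [], {"x"}, ("ret", "x"),
        ("let", "x", 5, ("app", "f", []))))
right = ("let", "x", 7, ("fun", "f", [], {"x"}, ("ret", "x"),
         ("let", "y", 5, ("app", "f", []))))
```

On these inputs, `coh({}, left)` is `False` because rebinding `x` makes `f` unavailable before the call, and `coh({}, right)` is `True`.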

7.2 Coherent Programs are Invariant 

Given a configuration $(F, V, t)$ such that $F f = (V', \overline{x}, s)$, the agreement invariant
describes a correspondence between the values of variables in the function closure $V'$
and the environment $V$. If the closure of $f$ is available, the closure environment $V'$
agrees with the primary environment $V$ on $f$'s globals $X$: $V' =_X V$. We write $F, V \models \Lambda$
if $\forall f \in \mathrm{dom}\,F \cap \mathrm{dom}\,\Lambda,\ V' =_X V$ (where $\Lambda f = X$ and $F f = (V', \overline{x}, s)$).

Function application continues evaluation with the function body from the closure.
Assume $F f = (V', \overline{x}, s)$ and consider the IL reduction:

$$(F, V, f\ \overline{e}) \to (F^f, V'[\overline{x} \mapsto \overline{v}], s)$$

If coherence is to be preserved, $s$ must be coherent under suitable assumptions. We say
$\Lambda$ approximates $\Lambda'$ if whenever $\Lambda f$ is defined, it agrees with $\Lambda'$, and define
$\Lambda \preceq \Lambda' :\Longleftrightarrow \forall f \in \mathrm{dom}\,\Lambda,\ \Lambda f = \Lambda' f$. The context coherence
predicate $\Lambda \vdash \mathrm{coh}\ F$ ensures that all function bodies in closures are coherent. It is
defined inductively on the context:

CohC-Emp
$$\frac{}{\emptyset \vdash \mathrm{coh}\ \emptyset}$$

CohC-Bot
$$\frac{\Lambda \vdash \mathrm{coh}\ F}{\Lambda; f : \bot \vdash \mathrm{coh}\ F; f : b}$$

CohC-Con
$$\frac{\Lambda' \vdash \mathrm{live}\ s : X \cup \overline{x} \qquad \Lambda; f : X \preceq \Lambda' \qquad \lfloor \Lambda; f : X \rfloor_X \vdash \mathrm{coh}\ s \qquad \Lambda \vdash \mathrm{coh}\ F}{\Lambda; f : X \vdash \mathrm{coh}\ F; f : (V, \overline{x}, s)}$$


CohC-Con encodes two requirements: First, the body of $f$ must be coherent under the
context restricted to the globals $X$ of $f$ (cf. Coh-Fun). Second, $X \cup \overline{x}$ must suffice
as live variables for the function body $s$ under some assumptions $\Lambda'$ such that $\Lambda; f : X$
approximates $\Lambda'$. Approximation ensures stability under restriction:
$\Lambda \vdash \mathrm{coh}\ F \Rightarrow \lfloor \Lambda \rfloor_X \vdash \mathrm{coh}\ F$.

We define $\mathrm{strip}(V, \overline{x}, s) = (\overline{x}, s)$ and lift strip pointwise to contexts.

Theorem 5 (Coherence implies Invariance) Let $\Lambda \vdash \mathrm{coh}\ s$ and $\Lambda \vdash \mathrm{coh}\ F$ and
$\Lambda' \vdash \mathrm{live}\ s : X$ such that $\Lambda \preceq \Lambda'$. Then for all $V =_X V'$ such that $F, V \models \Lambda$, it holds
that $(F, V, s)_F \simeq (\mathrm{strip}\ F, V', s)_I$.

Theorem 5 reduces the problem of translating between IL/I and IL to the problem 
of establishing coherence. For the translation from IL to IL/I, it suffices to establish 
coherence while preserving IL semantics. Since SSA and functional programming
correspond [10, 2], the translation from IL/I to IL can be seen as SSA construction [8], and
the translation from IL to IL/I, which we treat in the next section, as SSA destruction. 


8 Translating from IL/F to IL/I via Coherence 

The simplest method to establish coherence while preserving IL semantics is $\alpha$-renaming
the program apart. A renamed-apart program (for formal definition see Subsection 11.3)
is coherent, since every function is always available. The properties of $\alpha$-conversion
ensure semantic equivalence.

We present an algorithm that establishes coherence and uses no more different names
than the maximal number of simultaneously live variables in the program. This
algorithm corresponds to the assignment phase of SSA-based register allocation [9]. The
algorithm requires a renamed-apart program as input to ensure that every consistent
renaming can be expressed as a function from $\mathcal{V} \to \mathcal{V}$. We proceed in two steps:

1. We define the notion of local injectivity for a function $\rho : \mathcal{V} \to \mathcal{V}$. We show that
renaming with a locally injective $\rho$ yields an $\alpha$-equivalent and coherent program
$\rho\,s$.

2. We give an algorithm rassign and show that it constructs a locally injective $\rho$ that
uses the minimal number of different names.

We introduce more liveness annotations before every term in the syntax, i.e.
wherever a term $s$ appeared before, now a term $\langle X \rangle\, s$ appears that annotates $s$ with the set
$X$. From now on, $s, t$ range over such annotated terms. We define the projection
$[\langle X \rangle\, s] = X$. The annotation corresponds directly to the live set parameter $X$ of the
relation $\Lambda \vdash \mathrm{live}\ s : X$, hence it suffices to write $\Lambda \vdash \mathrm{live}\ s$ for annotated programs.


8.1 Local Injectivity 

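We define inductively a judgment $\rho \vdash \mathrm{inj}\ s$ where $\rho : \mathcal{V} \to \mathcal{V}$ and $s$ is an annotated
program. We use the following notation for injectivity on $X$:

$$f \rightarrowtail X :\Longleftrightarrow \forall x, y \in X,\ f x = f y \Rightarrow x = y$$

The rules defining the judgment are given below and require $\rho$ to be injective on every
live set $X$ annotating any subterm:

Inj-Op
$$\frac{\rho \rightarrowtail X \qquad \rho \vdash \mathrm{inj}\ s}{\rho \vdash \mathrm{inj}\ \langle X \rangle\ \mathsf{let}\ x = \eta\ \mathsf{in}\ s}$$

Inj-Val
$$\frac{\rho \rightarrowtail X}{\rho \vdash \mathrm{inj}\ \langle X \rangle\ e}$$

Inj-App
$$\frac{\rho \rightarrowtail X}{\rho \vdash \mathrm{inj}\ \langle X \rangle\ f\ \overline{y}}$$

Inj-Cond
$$\frac{\rho \rightarrowtail X \qquad \rho \vdash \mathrm{inj}\ s \qquad \rho \vdash \mathrm{inj}\ t}{\rho \vdash \mathrm{inj}\ \langle X \rangle\ \mathsf{if}\ x\ \mathsf{then}\ s\ \mathsf{else}\ t}$$

Inj-Fun
$$\frac{\rho \rightarrowtail X \qquad \rho \vdash \mathrm{inj}\ s \qquad \rho \vdash \mathrm{inj}\ t}{\rho \vdash \mathrm{inj}\ \langle X \rangle\ \mathsf{fun}\ f\ \overline{x} : X_1 = s\ \mathsf{in}\ t}$$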


Let $\mathcal{V}_B(s)$ be the set of variables that occur in a binding position in $s$, and $\mathrm{fv}(s)$ be
the set of free variables of $s$. For our theorems, several properties are required:

(1) The program must be without unreachable code, i.e. in every subterm fun $f\ \overline{x} = s$
in $t$ it must be the case that $f$ is applied in $t$.

(2) A variable in $\mathcal{V}_B(s)$ must not occur in a set of globals in $\Lambda$. We define
$\Lambda \subseteq U :\Longleftrightarrow \forall f \in \mathrm{dom}\,\Lambda,\ \Lambda f \subseteq U$.

(3) A variable in $\mathcal{V}_B(s)$ must not occur in the annotation $[s]$. We write $s \subseteq U$ if for
every subterm $t$ of $s$ it holds that every $x \in [t]$ is either in $U$ or bound at $t$ in $s$.

For renamed-apart programs, these conditions ensure that the live set $X$ in Inj-Fun
always contains the globals $X_1$ of $f$ (cf. Live-App).

Theorem 6 Let $s$ be a renamed-apart program without unreachable code such that
$\Lambda \vdash \mathrm{live}\ s$, $\Lambda \subseteq \mathrm{fv}(s)$ and $s \subseteq \mathrm{fv}(s)$. Then

$$\rho \vdash \mathrm{inj}\ s \Longrightarrow \rho\,(\lfloor \Lambda \rfloor_{[s]}) \vdash \mathrm{coh}\ (\rho\,s)$$

Theorem 6 states that the renamed program $\rho\,s$ is coherent under the assumptions
$\rho\,(\lfloor \Lambda \rfloor_{[s]})$, i.e. the point-wise image of $\lfloor \Lambda \rfloor_{[s]}$ under $\rho$.

Renaming with a locally injective renaming produces an $\alpha$-equivalent program (for
formal definition see Subsection 11.2), and hence preserves program equivalence:

Theorem 7 Let $s$ be a renamed-apart program without unreachable code such that
$\Lambda \vdash \mathrm{live}\ s$, $\Lambda \subseteq \mathrm{fv}(s)$ and $s \subseteq \mathrm{fv}(s)$. Let $\rho, d : \mathcal{V} \to \mathcal{V}$ such that $\rho$ is the
inverse of $d$ on $\mathrm{fv}(s)$. Then $\rho \vdash \mathrm{inj}\ s \Longrightarrow \rho, d \vdash \rho\,s \sim_\alpha s$


8.2 A Simple Register Assignment Algorithm 

The algorithm rassign is parametrized by a function $\mathrm{fresh} : \mathrm{set}\ \mathcal{V} \to \mathcal{V}$ of which we
require $\mathrm{fresh}\ X \notin X$ for all finite sets of variables $X$. Based on fresh, we define a function
$\mathrm{freshlist}\ X\ n$ that yields a list of $n$ pairwise-distinct variables such that
$(\mathrm{freshlist}\ X\ n) \cap X = \emptyset$. The SSA algorithm must process the program in an order compatible
with the dominance order to work [9]. In our case it suffices to simply recurse on $s$ as follows:


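$$\begin{array}{lcl}
\mathrm{rassign}\ \rho\ (\langle X \rangle\ \mathsf{let}\ x = \eta\ \mathsf{in}\ s) & = & \mathrm{rassign}\ (\rho[x \mapsto y])\ s \\
\quad \text{where } y = \mathrm{fresh}\ (\rho([s] \setminus \{x\})) & & \\
\mathrm{rassign}\ \rho\ (\langle X \rangle\ \mathsf{if}\ e\ \mathsf{then}\ s\ \mathsf{else}\ t) & = & \mathrm{rassign}\ (\mathrm{rassign}\ \rho\ s)\ t \\
\mathrm{rassign}\ \rho\ (\langle X \rangle\ e) & = & \rho \\
\mathrm{rassign}\ \rho\ (\langle X \rangle\ f\ \overline{e}) & = & \rho \\
\mathrm{rassign}\ \rho\ (\langle X \rangle\ \mathsf{fun}\ f\ \overline{x} : X' = s\ \mathsf{in}\ t) & = & \mathrm{rassign}\ (\mathrm{rassign}\ (\rho[\overline{x} \mapsto \overline{y}])\ s)\ t \\
\quad \text{where } \overline{y} = \mathrm{freshlist}\ (\rho([s] \setminus \overline{x}))\ |\overline{x}| & &
\end{array}$$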

We prove in Theorem 8 that the algorithm is correct for any choice of fresh and 
freshlist, as long as they satisfy the specifications above. 

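Theorem 8 Let $s$ be renamed-apart such that $\Lambda \vdash \mathrm{live}\ s$, $\Lambda \subseteq \mathrm{fv}(s)$ and $s \subseteq \mathrm{fv}(s)$. Let
$\rho$ be injective on $[s]$. Then: $\mathrm{rassign}\ \rho\ s \vdash \mathrm{inj}\ s$.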


Our implementation of fresh implements the heuristic of simply choosing the smallest
unused variable. Theorem 9 shows that for this choice of fresh, the largest live set
determines the number of required names. We use $S(k)$ to denote the set of the $k$
smallest variables, and $\mathcal{V}_O(s)$ to denote the set of variables occurring (free or in a binding
position) in $s$.

Theorem 9 Assume $\mathrm{fresh}\ X$ yields a variable less or equal to $|X|$. Let $s$ be renamed-apart
such that $\Lambda \vdash \mathrm{live}\ s$, $\Lambda \subseteq \mathrm{fv}(s)$ and $s \subseteq \mathrm{fv}(s)$. Let $k$ be the size of the largest
set of live variables in $s$, and $\mathrm{rassign}\ \rho\ s = \rho'$. If $\rho(\mathrm{fv}(s)) \subseteq S(n)$ then
$\rho'(\mathcal{V}_O(s)) \subseteq S(\max\{n, k\})$.

We prove a slightly generalized version of Theorem 9 by induction on s. 


9 Formal Coq Development 

Each theorem and lemma in this paper is proven as part of a larger Coq development, 
which is available online 1 . The development extracts to a simple compiler that, for 
instance, produces program (b) when given program (a) from the introduction as input. 

The formalization uses De-Bruijn representation for labels, and named representation 
for variables. Notable differences to the paper presentation concern the treatment of 
annotations, the technical realization of the definition of liveness, and the inductive 
generalizations of Theorems 6-9. 

¹ http://www.ps.uni-saarland.de/~sdschn/publications/lvc15


10 Conclusion 

We presented the functional intermediate language IL and developed the notion of
coherence, which provides for a canonical and verified translation between functional and
imperative programs. We formulated a register assignment algorithm by recursion on
the structure of IL that achieves the same bound on the number of required registers
as SSA-based register assignment. Coherence allowed us to justify correctness without
directly arguing about program semantics by proving that the algorithm α-renames to
a coherent program.


11 Appendix 

11.1 Table of Variable Names and Types 


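Variable         Type                                   Comment

$\mathbb{V}$     set                                    set of values
$\beta$          $\mathbb{V} \to \{0,1\}$               conversion to truth value
$v$              $\mathbb{V}$                           value
Exp              set                                    set of expressions
$\mathcal{V}$    set                                    set of variables
$e$              Exp                                    expression
$x, y, z$        $\mathcal{V}$                          variables
$\mathcal{L}$    set                                    set of labels
$f, g$           $\mathcal{L}$                          labels
$\mathcal{A}$    set                                    set of actions
$\eta$           Exp + $\mathcal{A}$                    extended expression
$\alpha$         $\mathcal{A}$                          action
Term             set                                    set of terms
$s, t$           Term                                   terms
$V$              $\mathcal{V} \to \mathbb{V}_\bot$      environment
$\mathcal{C}$    set                                    set of closures
$F$              context of $\mathcal{C}$
$\mathcal{E}$    set                                    set of events
$\phi$           $\mathcal{E}$                          event
$\tau$           $\mathcal{E}$                          silent event
$\mathcal{B}$    set                                    set of blocks
$L$              context of $\mathcal{B}$
$\Sigma$         set                                    set of states (LTS)
$\sigma$         $\Sigma$                               state, configuration
$\Pi$            set                                    set of partial traces
$\pi$            $\Pi$                                  partial trace
$\epsilon$       $\Pi$                                  empty trace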


11.2 α-Equivalence

We formalize a generalization of alpha equivalence as an inductively defined judgment
$\rho, d \vdash s \sim_\alpha t$ where $\rho, d : \mathcal{V} \to \mathcal{V}$ and $s, t$ are terms. The mapping $\rho$ describes how the
free variables of s map to free variables of t, and d describes how the free variables of t
map to free variables of s. If $\rho, d \vdash s \sim_\alpha t$ holds, then d is the inverse of $\rho$ on fv(s), i.e.

\[ \forall x \in \mathit{fv}(s),\ d(\rho\,x) = x \]

Symmetrically, $\rho$ is the inverse of d on fv(t).

The formalization assumes a similar judgment $\rho, d \vdash_{Exp} e \sim_\alpha e'$ for α-equivalence of
expressions. The variable case of the judgment for expressions explains how $\rho$ and d are
used:

\[
\textsc{Alpha-Var}\quad
\frac{\rho\,x = y \qquad d\,y = x}{\rho, d \vdash_{Exp} x \sim_\alpha y}
\]

Alpha-Var ensures that $\rho$ maps x to y and d maps y to x.

The other rules of the expression judgment are structurally recursive and we omit 
them. 

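\[
\textsc{Alpha-Op}\quad
\frac{\rho, d \vdash_{Exp} \eta \sim_\alpha \eta' \qquad \rho[x \mapsto x'], d[x' \mapsto x] \vdash s \sim_\alpha s'}
     {\rho, d \vdash \mathbf{let}\ x = \eta\ \mathbf{in}\ s \sim_\alpha \mathbf{let}\ x' = \eta'\ \mathbf{in}\ s'}
\qquad
\textsc{Alpha-Val}\quad
\frac{\rho, d \vdash_{Exp} e \sim_\alpha e'}{\rho, d \vdash e \sim_\alpha e'}
\]

\[
\textsc{Alpha-App}\quad
\frac{\forall i,\ \rho, d \vdash_{Exp} e_i \sim_\alpha e'_i}
     {\rho, d \vdash f\ \overline{e} \sim_\alpha f\ \overline{e'}}
\qquad
\textsc{Alpha-Cond}\quad
\frac{\rho, d \vdash_{Exp} e \sim_\alpha e' \qquad \rho, d \vdash s \sim_\alpha s' \qquad \rho, d \vdash t \sim_\alpha t'}
     {\rho, d \vdash \mathbf{if}\ e\ \mathbf{then}\ s\ \mathbf{else}\ t \sim_\alpha \mathbf{if}\ e'\ \mathbf{then}\ s'\ \mathbf{else}\ t'}
\]

\[
\textsc{Alpha-Fun}\quad
\frac{\rho[\overline{x} \mapsto \overline{x'}], d[\overline{x'} \mapsto \overline{x}] \vdash s \sim_\alpha s' \qquad \rho, d \vdash t \sim_\alpha t' \qquad |\overline{x}| = |\overline{x'}|}
     {\rho, d \vdash \mathbf{fun}\ f\ \overline{x} = s\ \mathbf{in}\ t \sim_\alpha \mathbf{fun}\ f\ \overline{x'} = s'\ \mathbf{in}\ t'}
\]

Figure 5: Inductive judgment generalizing α-equivalence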
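
The rules of Figure 5 are syntax-directed, so they can be read as a recursive check. The following Python sketch illustrates this under assumptions made only for this example: terms and expressions are encoded as tagged tuples, the mappings $\rho$ and d are finite dictionaries, and the extended expression in a let is treated as an ordinary expression (actions are not modeled).

```python
def alpha_exp(rho, d, e1, e2):
    """Expression judgment: ("var", x), ("const", c) or ("op", name, [args])."""
    if e1[0] != e2[0]:
        return False
    if e1[0] == "var":                      # Alpha-Var
        x, y = e1[1], e2[1]
        return rho.get(x) == y and d.get(y) == x
    if e1[0] == "const":
        return e1[1] == e2[1]
    return (e1[1] == e2[1] and len(e1[2]) == len(e2[2])
            and all(alpha_exp(rho, d, a, b) for a, b in zip(e1[2], e2[2])))


def bind(m, xs, ys):
    """The updated mapping m[xs -> ys]."""
    out = dict(m)
    out.update(zip(xs, ys))
    return out


def alpha(rho, d, s, t):
    """Term judgment rho, d |- s ~ t following Figure 5."""
    if s[0] != t[0]:
        return False
    tag = s[0]
    if tag == "let":                        # Alpha-Op
        _, x, eta, body = s
        _, x2, eta2, body2 = t
        return (alpha_exp(rho, d, eta, eta2)
                and alpha(bind(rho, [x], [x2]), bind(d, [x2], [x]), body, body2))
    if tag == "ret":                        # Alpha-Val
        return alpha_exp(rho, d, s[1], t[1])
    if tag == "app":                        # Alpha-App
        return (s[1] == t[1] and len(s[2]) == len(t[2])
                and all(alpha_exp(rho, d, a, b) for a, b in zip(s[2], t[2])))
    if tag == "if":                         # Alpha-Cond
        return (alpha_exp(rho, d, s[1], t[1])
                and alpha(rho, d, s[2], t[2]) and alpha(rho, d, s[3], t[3]))
    if tag == "fun":                        # Alpha-Fun
        _, f, xs, body, rest = s
        _, f2, xs2, body2, rest2 = t
        return (f == f2 and len(xs) == len(xs2)
                and alpha(bind(rho, xs, xs2), bind(d, xs2, xs), body, body2)
                and alpha(rho, d, rest, rest2))
    raise ValueError(f"unknown term: {s!r}")


def add(a, b):
    return ("op", "+", [("var", a), ("var", b)])


s1 = ("let", "x", ("const", 1), ("ret", add("x", "y")))
s2 = ("let", "z", ("const", 1), ("ret", add("z", "y")))
s3 = ("let", "y", ("const", 1), ("ret", add("y", "y")))   # binder captures free y
ident = {"y": "y"}                                        # identity on the free variable y
print(alpha(ident, ident, s1, s2), alpha(ident, ident, s1, s3))  # True False
```

Tracking both directions is what rejects `s3`: after the binder is entered, d maps `y` back to `x`, so the free occurrence of `y` no longer matches.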

The relation has several pleasant properties. 

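Lemma 2 (Reflexivity) $\mathit{id}, \mathit{id} \vdash s \sim_\alpha s$

Lemma 3 (Symmetry) $\rho, d \vdash s \sim_\alpha s' \Rightarrow d, \rho \vdash s' \sim_\alpha s$

Lemma 4 (Transitivity) $\rho_1, d_1 \vdash s \sim_\alpha s' \Rightarrow \rho_2, d_2 \vdash s' \sim_\alpha s'' \Rightarrow \rho_1 \circ \rho_2, d_2 \circ d_1 \vdash s \sim_\alpha s''$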


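We validate our definition and prove soundness with respect to trace equivalence $\sim$.
We define

\[ V =_{\rho,d} V' \ :\Longleftrightarrow\ \forall x\,y,\ \rho\,x = y \Rightarrow d\,y = x \Rightarrow V\,x = V'\,y \]

We relate two closures in the following way:

\[ (V, \overline{x}, s) =_\alpha (V', \overline{x'}, s')
   \ :\Longleftrightarrow\ |\overline{x}| = |\overline{x'}| \wedge \exists \rho\,d,\ V =_{\rho,d} V' \wedge \rho[\overline{x} \mapsto \overline{x'}], d[\overline{x'} \mapsto \overline{x}] \vdash s \sim_\alpha s' \]

We then lift $=_\alpha$ point-wise to contexts of the same length.

Theorem 10 If $F =_\alpha F'$ and $V =_{\rho,d} V'$ then $(F, V, s) \sim (F', V', s')$.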

In the formal development we have an additional formalization of IL which uses
De-Bruijn representation also for variables (and not just for labels). We give a translation
from the named IL to De-Bruijn IL, and prove this translation correct with respect
to trace equivalence. We then show that terms that are α-equivalent by our inductive
definition translate to identical terms in De-Bruijn representation.
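
To illustrate the idea, and not the translation of the formal development, the following Python sketch converts named terms to a De-Bruijn form in which each bound variable is replaced by its distance to its binder. The tagged-tuple encoding, the `bvar`/`fvar` tags, and the decision to leave labels and free variables named are assumptions of this example.

```python
def db_exp(e, env):
    """Translate an expression; env lists the bound names, innermost first."""
    tag = e[0]
    if tag == "var":
        x = e[1]
        # env.index finds the innermost binder of x
        return ("bvar", env.index(x)) if x in env else ("fvar", x)
    if tag == "const":
        return e
    return ("op", e[1], [db_exp(a, env) for a in e[2]])


def db_term(s, env=()):
    """Translate a named term into a term without variable names at binders."""
    env = list(env)
    tag = s[0]
    if tag == "let":
        _, x, eta, body = s
        return ("let", db_exp(eta, env), db_term(body, [x] + env))
    if tag == "ret":
        return ("ret", db_exp(s[1], env))
    if tag == "app":
        return ("app", s[1], [db_exp(a, env) for a in s[2]])
    if tag == "if":
        return ("if", db_exp(s[1], env), db_term(s[2], env), db_term(s[3], env))
    if tag == "fun":
        _, f, xs, body, rest = s
        inner = list(reversed(xs)) + env     # the last parameter is innermost
        return ("fun", f, len(xs), db_term(body, inner), db_term(rest, env))
    raise ValueError(f"unknown term: {s!r}")


def add(a, b):
    return ("op", "+", [("var", a), ("var", b)])


a = ("let", "x", ("const", 1), ("let", "y", ("var", "x"), ("ret", add("y", "x"))))
b = ("let", "u", ("const", 1), ("let", "v", ("var", "u"), ("ret", add("v", "u"))))
c = ("let", "x", ("const", 1), ("let", "y", ("var", "x"), ("ret", add("x", "y"))))
print(db_term(a) == db_term(b), db_term(a) == db_term(c))  # True False
```

The terms `a` and `b` differ only in the names of their bound variables and translate to the same De-Bruijn term, while `c` swaps the operands and translates to a different one.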

11.3 Definition of Renamed Apart 

A program is renamed apart if every variable x occurring in a binding position does not
occur free and x is different from every variable occurring in a different binding position.
We formulate an inductive predicate $X \vdash s\ \mathit{apart}\ X'$ that ensures this property. The
predicate maintains the invariant that all free variables of s are in X, and that X'
contains exactly the variables occurring in binding positions in s.


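\[
\textsc{Apart-Op}\quad
\frac{\mathit{fv}(e) \subseteq X \qquad X \cup \{x\} \vdash s\ \mathit{apart}\ X'}
     {X \vdash \mathbf{let}\ x = \eta\ \mathbf{in}\ s\ \mathit{apart}\ X' \cup \{x\}}
\qquad
\textsc{Apart-Val}\quad
\frac{\mathit{fv}(e) \subseteq X}{X \vdash e\ \mathit{apart}\ \emptyset}
\qquad
\textsc{Apart-App}\quad
\frac{\mathit{fv}(\overline{e}) \subseteq X}{X \vdash f\ \overline{e}\ \mathit{apart}\ \emptyset}
\]

\[
\textsc{Apart-Cond}\quad
\frac{\mathit{fv}(e) \subseteq X \qquad X_s \cap X_t = \emptyset \qquad X \vdash s\ \mathit{apart}\ X_s \qquad X \vdash t\ \mathit{apart}\ X_t}
     {X \vdash \mathbf{if}\ e\ \mathbf{then}\ s\ \mathbf{else}\ t\ \mathit{apart}\ X_s \cup X_t}
\]

\[
\textsc{Apart-Fun}\quad
\frac{X \vdash t\ \mathit{apart}\ X_t \qquad X \cup \overline{x} \vdash s\ \mathit{apart}\ X_s \qquad \mathit{unique}\ \overline{x} \qquad \overline{x} \cap X = \emptyset \qquad (X_s \cup \overline{x}) \cap X_t = \emptyset}
     {X \vdash \mathbf{fun}\ f\ \overline{x} = s\ \mathbf{in}\ t\ \mathit{apart}\ X_s \cup X_t \cup \overline{x}}
\]

Figure 6: Inductive definition of renamed apart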

Lemma 5 (Disjoint) If $X \vdash s\ \mathit{apart}\ X'$ then $X \cap X' = \emptyset$.

Lemma 6 (Relation to free and bound variables) If $X \vdash s\ \mathit{apart}\ X'$ then $\mathit{fv}(s) \subseteq X$
and $X' = V_B(s)$.
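
Since the predicate is syntax-directed, X' can be computed from X and s. The following Python sketch does so under assumptions made only for this example: terms are tagged tuples, the function is called `apart_check` (a hypothetical name), and it returns `None` when no derivation exists. The check `x in X` in the let case is an added assumption that does not appear in the rule as printed above; without it the disjointness stated in Lemma 5 would not hold for shadowing binders.

```python
def fv_exp(e):
    """Free variables of ("var", x), ("const", c) or ("op", name, [args])."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "const":
        return set()
    return set().union(*[fv_exp(a) for a in e[2]])


def apart_check(X, s):
    """Return X' such that X |- s apart X', or None if there is no derivation."""
    X = set(X)
    tag = s[0]
    if tag == "ret":                                   # Apart-Val
        return set() if fv_exp(s[1]) <= X else None
    if tag == "app":                                   # Apart-App
        return set() if all(fv_exp(a) <= X for a in s[2]) else None
    if tag == "let":                                   # Apart-Op
        _, x, eta, body = s
        # ASSUMPTION: `x in X` is rejected here (not visible in the printed rule).
        if not fv_exp(eta) <= X or x in X:
            return None
        Xb = apart_check(X | {x}, body)
        return None if Xb is None else Xb | {x}
    if tag == "if":                                    # Apart-Cond
        _, e, s1, s2 = s
        Xs, Xt = apart_check(X, s1), apart_check(X, s2)
        if not fv_exp(e) <= X or Xs is None or Xt is None or Xs & Xt:
            return None
        return Xs | Xt
    if tag == "fun":                                   # Apart-Fun
        _, f, xs, body, rest = s
        P = set(xs)
        Xt, Xs = apart_check(X, rest), apart_check(X | P, body)
        if Xs is None or Xt is None:
            return None
        if len(P) != len(xs) or P & X or (Xs | P) & Xt:
            return None
        return Xs | Xt | P
    raise ValueError(f"unknown term: {s!r}")


ok = ("let", "x", ("const", 1), ("let", "y", ("var", "x"), ("ret", ("var", "y"))))
shadow = ("let", "x", ("const", 1), ("let", "x", ("var", "x"), ("ret", ("var", "x"))))
twice = ("if", ("const", 1),
         ("let", "z", ("const", 1), ("ret", ("var", "z"))),
         ("let", "z", ("const", 2), ("ret", ("var", "z"))))
print(apart_check(set(), ok), apart_check(set(), shadow), apart_check(set(), twice))
```

The first program is renamed apart with binders `x` and `y`; the second rebinds `x`, and the third binds `z` in both branches, so both are rejected.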


11.4 A Procedure to Rename Apart 

We define the procedure

\[ \mathit{apart} : (\mathcal{V} \to \mathcal{V}) \to (\mathrm{set}\ \mathcal{V}) \to Exp \to (\mathrm{set}\ \mathcal{V}) \times Exp \]

such that $\mathit{apart}\ \rho\ X\ s = (X', s')$ ensures s' is renamed apart and α-equivalent to s. X'
contains the newly chosen variables now occurring in binding positions in s'. Theorem 11
and Theorem 12 make these claims precise.

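\begin{align*}
\mathit{apart}\ \rho\ X\ (\mathbf{let}\ x = \eta\ \mathbf{in}\ s) &= (X' \cup \{y\},\ \mathbf{let}\ y = \rho\eta\ \mathbf{in}\ s') \\
  &\quad \text{where } (X', s') = \mathit{apart}\ (\rho[x \mapsto y])\ (X \cup \{y\})\ s \\
  &\quad \text{where } y = \mathit{fresh}\ X \\
\mathit{apart}\ \rho\ X\ (\mathbf{if}\ e\ \mathbf{then}\ s\ \mathbf{else}\ t) &= (X_s \cup X_t,\ \mathbf{if}\ (\rho e)\ \mathbf{then}\ s'\ \mathbf{else}\ t') \\
  &\quad \text{where } (X_s, s') = \mathit{apart}\ \rho\ X\ s \\
  &\quad \text{where } (X_t, t') = \mathit{apart}\ \rho\ (X \cup X_s)\ t \\
\mathit{apart}\ \rho\ X\ e &= (\emptyset,\ \rho e) \\
\mathit{apart}\ \rho\ X\ (f\ \overline{e}) &= (\emptyset,\ f\ (\rho\overline{e})) \\
\mathit{apart}\ \rho\ X\ (\mathbf{fun}\ f\ \overline{x} = s\ \mathbf{in}\ t) &= (X_s \cup X_t \cup \overline{y},\ \mathbf{fun}\ f\ \overline{y} = s'\ \mathbf{in}\ t') \\
  &\quad \text{where } \overline{y} = \mathit{freshlist}\ X\ |\overline{x}| \\
  &\quad \text{where } (X_s, s') = \mathit{apart}\ (\rho[\overline{x} \mapsto \overline{y}])\ (X \cup \overline{y})\ s \\
  &\quad \text{where } (X_t, t') = \mathit{apart}\ \rho\ (X \cup X_s \cup \overline{y})\ t
\end{align*}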

Theorem 11 (apart renames apart) Let s be a program such that $\rho(\mathit{fv}(s)) \subseteq X$ and
$\mathit{apart}\ \rho\ X\ s = (X', s')$. Then: $X \vdash s'\ \mathit{apart}\ X'$.

Theorem 12 (Renaming apart respects α-conversion) Let s be a program such
that $\rho(\mathit{fv}(s)) \subseteq X$ and $\mathit{apart}\ \rho\ X\ s = (X', s')$ and let d be inverse to $\rho$ on fv(s). Then
$\rho, d \vdash s \sim_\alpha s'$.
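
The procedure can be transcribed almost line by line. The following Python sketch is one such transcription under assumptions made only for this example: terms are tagged tuples, variables are strings, `fresh` picks the first unused name of the form `v0, v1, ...`, the function is called `rename_apart` (a hypothetical name), and the extended expression of a let is treated as an ordinary expression.

```python
def fresh(X):
    """First name of the form v0, v1, ... that is not in X."""
    i = 0
    while f"v{i}" in X:
        i += 1
    return f"v{i}"


def freshlist(X, n):
    """n pairwise-distinct names, none of which is in X."""
    taken, out = set(X), []
    for _ in range(n):
        y = fresh(taken)
        out.append(y)
        taken.add(y)
    return out


def ren_exp(rho, e):
    """Apply the renaming rho to the variables of an expression."""
    if e[0] == "var":
        return ("var", rho[e[1]])
    if e[0] == "const":
        return e
    return ("op", e[1], [ren_exp(rho, a) for a in e[2]])


def rename_apart(rho, X, s):
    """Return (X', s') where X' holds the binders chosen for the renamed term s'."""
    tag = s[0]
    if tag == "let":
        _, x, eta, body = s
        y = fresh(X)
        Xb, body2 = rename_apart({**rho, x: y}, X | {y}, body)
        return Xb | {y}, ("let", y, ren_exp(rho, eta), body2)
    if tag == "if":
        _, e, s1, t1 = s
        Xs, s2 = rename_apart(rho, X, s1)
        Xt, t2 = rename_apart(rho, X | Xs, t1)   # avoid the names used in s'
        return Xs | Xt, ("if", ren_exp(rho, e), s2, t2)
    if tag == "ret":
        return set(), ("ret", ren_exp(rho, s[1]))
    if tag == "app":
        return set(), ("app", s[1], [ren_exp(rho, a) for a in s[2]])
    if tag == "fun":
        _, f, xs, body, rest = s
        ys = freshlist(X, len(xs))
        Y = set(ys)
        Xs, body2 = rename_apart({**rho, **dict(zip(xs, ys))}, X | Y, body)
        Xt, rest2 = rename_apart(rho, X | Xs | Y, rest)
        return Xs | Xt | Y, ("fun", f, ys, body2, rest2)
    raise ValueError(f"unknown term: {s!r}")


branch = ("let", "y", ("var", "x"), ("ret", ("var", "y")))
src = ("let", "x", ("const", 1), ("if", ("var", "x"), branch, branch))
bound, out = rename_apart({}, set(), src)
print(sorted(bound))  # ['v0', 'v1', 'v2']
print(out)
```

The source binds `y` in both branches; the result binds `v1` in the first branch and `v2` in the second, because the second recursive call receives the names chosen by the first.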

11.5 Joining the Parts 

This section describes how the theorems proven in this paper fit together in a compiler.
Assume that the compiler uses IL as an intermediate language, and now wants to produce
code for an IL program $s_1$. The compiler proceeds as follows:

1. Rename $s_1$ apart, obtaining an α-equivalent program $s_2$ (Theorem 12).

2. Run the algorithm rassign on $s_2$ to obtain a register assignment $\rho$.
Theorem 8 ensures $\rho$ is locally injective.

3. Rename $s_2$ according to $\rho$ and obtain $s_3$, which is α-equivalent (Theorem 7) and
coherent (Theorem 6) because $\rho$ is locally injective.

4. Theorem 5 ensures that $s_3$ can be seen equivalently as an IL/I program,
hence the functional program $s_1$ has been translated to an imperative program $s_3$.